-
Notifications
You must be signed in to change notification settings - Fork 215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add fine-tuning LLM to generate Persian product catalogs in JSON format #86
feat: add fine-tuning LLM to generate Persian product catalogs in JSON format #86
Conversation
…ence jupyter notebook
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
fix: add "execution_count": null
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
Hello, thanks for your contribution! I think this recipe may be a bit too generic for the cookbook because it doesn't have a specific focus on a real-world application (biomedical, legal, finance domains) or showcase a new technique. One way I think we can improve this recipe is if you show how to finetune the model on domain-specific data. |
…b to fine_tuning_llm_for_generate_persian_product_catalog_in_json_format.ipynb
…ormat.ipynb to fine_tuning_llm_for_generate_persian_product_catalogs_in_json_format.ipynb
Hello. I didn't realize the focus was on specific domains. Interestingly, I have worked on unstructured data of Iranian products in a well-known Iranian marketplace and was able to obtain JSON output with an LLM. Therefore, I have now modified this Jupyter notebook for this task. |
notebooks/en/fine_tuning_llm_for_generate_persian_product_catalogs_in_json_format.ipynb
Outdated
Show resolved
Hide resolved
Cool pivot to showcasing a use case for your recipe! 👏 I think what will make it even more impactful is to generate an example so users can see the output. |
…format.ipynb feat: add an example and remove importance level
I deleted the importance level and also wrote an example from the model we had developed on this task to make the output tangible. |
notebooks/en/fine_tuning_llm_for_generate_persian_product_catalogs_in_json_format.ipynb
Outdated
Show resolved
Hide resolved
One more little thing to fix, otherwise looks good to me! |
…format.ipynb Change example output from code cell to markdown cell
I've moved the example output to a markdown cell as suggested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
notebooks/en/fine_tuning_llm_for_generate_persian_product_catalogs_in_json_format.ipynb
Outdated
Show resolved
Hide resolved
@@ -0,0 +1,603 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so many mix of active and passive voice, avoiding passive voice makes it so readable, maybe you could rewrite for clarity?
Reply via ReviewNB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could actually remove this entire paragraph since this notebook doesn't show how to do alignment.
@@ -0,0 +1,603 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These a re hyperparameters, maybe you can remove comments and explain what each block is for in a markdown cell here and add comments for granular ones, i.e. walk reader through what you're doing and how you're doing it?
Reply via ReviewNB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is resolved yet. You can break up this large cell block into smaller blocks and give a brief description of what's happening inside each block.
For example, the paragraph where you describe LoRA parameters can be placed before a LoRA parameter block. Likewise, you can split off the QLoRA block and add your sentence about QLoRA there.
notebooks/en/fine_tuning_llm_for_generate_persian_product_catalogs_in_json_format.ipynb
Outdated
Show resolved
Hide resolved
@@ -0,0 +1,603 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not resolved yet. I think Merve meant putting it at the bottom of the notebook so readers don't get distracted.
notebooks/en/fine_tuning_llm_for_generate_persian_product_catalogs_in_json_format.ipynb
Outdated
Show resolved
Hide resolved
notebooks/en/fine_tuning_llm_for_generate_persian_product_catalogs_in_json_format.ipynb
Outdated
Show resolved
Hide resolved
notebooks/en/fine_tuning_llm_for_generate_persian_product_catalogs_in_json_format.ipynb
Outdated
Show resolved
Hide resolved
notebooks/en/fine_tuning_llm_for_generate_persian_product_catalogs_in_json_format.ipynb
Outdated
Show resolved
Hide resolved
notebooks/en/fine_tuning_llm_for_generate_persian_product_catalogs_in_json_format.ipynb
Outdated
Show resolved
Hide resolved
…format.ipynb resolve all merveenoyan hints and suggestions
I tried to follow all the points. If there is anything else, tell me. |
Is it ok? |
Hello. I am still waiting for your approval and merge, or any feedback you may have. If there are any issues, please let me know so I can fix them. |
Hello. Does this PR still have a point to merge? If you tell me, I will fix it. Because no matter what I asked Mrs. merveenoyan, she did not answer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay!
I left a few more comments about some unresolved issues.
…format.ipynb to fine_tuning_llm_to_generate_persian_product_catalogs_in_json_format.ipynb
I've tried to incorporate your feedback and address the issues. If there are still any problems, please let me know. |
@@ -0,0 +1,779 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: *Model Training
Also would be nice to walk people through what you're doing when you load bnb config, lora config and more (we try to assume the reader isn't very knowledgeable)
Reply via ReviewNB
@MrzEsma Hello, I just left a very general comment, we can merge once addressed :) |
Hello. I tried to review the notebook again and explain any concept that is specific to fine-tuning language models. Once again, if you think there's something I haven't explained, please mention it. Thank you. |
…ormat.ipynb edit our HF link.
What does this PR do?
Hello. In this PR, I tried to create a notebook that fine-tune an LLM as "Hello World" in the simplest way. In addition, to explain the concepts used in this path, I would like to refer to the existing valuable blogs that explained it well and simply. Finally, I also brought a simple code example for fast inference.
Who can review?
@merveenoyan and @stevhliu